Geometric Rectification of Camera-Captured Document Images
Internal identifier: 000D16 (Main/Exploration); previous: 000D15; next: 000D17
Authors: Jian Liang [United States]; Daniel Dementhon [United States]; David Doermann [United States]
Source:
- IEEE Transactions on Pattern Analysis and Machine Intelligence [ISSN 0162-8828]; 2008.
French descriptors
- Pascal (Inist)
- Intelligence artificielle, Analyse forme, Reconnaissance optique caractère, Reconnaissance caractère, Image tridimensionnelle, Texture, Traitement flux donnée, Caméra vidéo, Courbure, Analyse documentaire, Analyse texture, Rectification, Scanneur, Projection perspective, Frontal, Métrique, Etalonnage.
- Wicri:
- topic: Intelligence artificielle.
English descriptors
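- KwdEn: Artificial intelligence, Calibration, Character recognition, Curvature, Data flow processing, Document analysis, Frontal, Metric, Optical character recognition, Pattern analysis, Perspective projection, Rectification, Scanner, Texture, Texture analysis, Tridimensional image, Video cameras.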
Abstract
Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and noncontact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by a nonplanar document shape and perspective projection, which lead to the failure of current optical character recognition (OCR) technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates the 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.
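Affiliations:
- Country: United States
- Regions: Maryland, Washington (state)
- Settlement: College Park (Maryland)
- Organization: University of Maryland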
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000287
- to stream PascalFrancis, to step Curation: 000497
- to stream PascalFrancis, to step Checkpoint: 000238
- to stream Main, to step Merge: 000D28
- to stream Main, to step Curation: 000D16
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Geometric Rectification of Camera-Captured Document Images</title>
<author><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Amazon.com, 701 5th Ave. #614.B</s1>
<s2>Seattle, WA 98104</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<affiliation wicri:level="4"><inist:fA14 i1="02"><s1>Institute for Advanced Computer Studies, University of Maryland, 3449 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation wicri:level="4"><inist:fA14 i1="03"><s1>Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, University of Maryland, 3451 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
<placeName><settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">08-0175226</idno>
<date when="2008">2008</date>
<idno type="stanalyst">PASCAL 08-0175226 INIST</idno>
<idno type="RBID">Pascal:08-0175226</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000287</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000497</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000238</idno>
<idno type="wicri:doubleKey">0162-8828:2008:Jian Liang:geometric:rectification:of</idno>
<idno type="wicri:Area/Main/Merge">000D28</idno>
<idno type="wicri:Area/Main/Curation">000D16</idno>
<idno type="wicri:Area/Main/Exploration">000D16</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Geometric Rectification of Camera-Captured Document Images</title>
<author><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
<affiliation wicri:level="2"><inist:fA14 i1="01"><s1>Amazon.com, 701 5th Ave. #614.B</s1>
<s2>Seattle, WA 98104</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<affiliation wicri:level="4"><inist:fA14 i1="02"><s1>Institute for Advanced Computer Studies, University of Maryland, 3449 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
</affiliation>
</author>
<author><name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
<affiliation wicri:level="4"><inist:fA14 i1="03"><s1>Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, University of Maryland, 3451 AV Williams Building</s1>
<s2>College Park, MD 20742</s2>
<s3>USA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName><region type="state">Maryland</region>
<settlement type="city">College Park (Maryland)</settlement>
</placeName>
<orgName type="university">Université du Maryland</orgName>
<placeName><settlement type="city">College Park (Maryland)</settlement>
<region type="state">Maryland</region>
</placeName>
<orgName type="university" n="3">Université du Maryland</orgName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">IEEE transactions on pattern analysis and machine intelligence</title>
<title level="j" type="abbreviated">IEEE trans. pattern anal. mach. intell.</title>
<idno type="ISSN">0162-8828</idno>
<imprint><date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">IEEE transactions on pattern analysis and machine intelligence</title>
<title level="j" type="abbreviated">IEEE trans. pattern anal. mach. intell.</title>
<idno type="ISSN">0162-8828</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Artificial intelligence</term>
<term>Calibration</term>
<term>Character recognition</term>
<term>Curvature</term>
<term>Data flow processing</term>
<term>Document analysis</term>
<term>Frontal</term>
<term>Metric</term>
<term>Optical character recognition</term>
<term>Pattern analysis</term>
<term>Perspective projection</term>
<term>Rectification</term>
<term>Scanner</term>
<term>Texture</term>
<term>Texture analysis</term>
<term>Tridimensional image</term>
<term>Video cameras</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Intelligence artificielle</term>
<term>Analyse forme</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Image tridimensionnelle</term>
<term>Texture</term>
<term>Traitement flux donnée</term>
<term>Caméra vidéo</term>
<term>Courbure</term>
<term>Analyse documentaire</term>
<term>Analyse texture</term>
<term>Rectification</term>
<term>Scanneur</term>
<term>Projection perspective</term>
<term>Frontal</term>
<term>Métrique</term>
<term>Etalonnage</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Intelligence artificielle</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and noncontact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by a nonplanar document shape and perspective projection, which lead to the failure of current optical character recognition (OCR) technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates the 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Maryland</li>
<li>Washington (État)</li>
</region>
<settlement><li>College Park (Maryland)</li>
</settlement>
<orgName><li>Université du Maryland</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="Washington (État)"><name sortKey="Jian Liang" sort="Jian Liang" uniqKey="Jian Liang" last="Jian Liang">JIAN LIANG</name>
</region>
<name sortKey="Dementhon, Daniel" sort="Dementhon, Daniel" uniqKey="Dementhon D" first="Daniel" last="Dementhon">Daniel Dementhon</name>
<name sortKey="Doermann, David" sort="Doermann, David" uniqKey="Doermann D" first="David" last="Doermann">David Doermann</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000D16 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000D16 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:08-0175226 |texte= Geometric Rectification of Camera-Captured Document Images }}
This area was generated with Dilib version V0.6.32.